Problems of the Mechanization of
Scientific Information

by

L. I. Gutenmakher, Doctor of Technical Sciences
From: Vestnik Akademii Nauk SSSR, No. 8 (Aug. 1952), pp. 46-52

The problem of the mechanization of scientific information, posed by
Academician A. N. Nesmeyanov, President of the Academy of Sciences of the
USSR, is a knotty one. Its solution must bring about a fundamental change
in the methods of handling scientific-technical information, will
contribute to the optimum organization of scientific work, and will assist
in the most rapid introduction of the achievements of science and
technology into the national economy.

The continual increase in the number of scientific investigations and
technical developments has led to a torrential flow of literature, which has
literally engulfed scientists and engineers.

The total number of printed works accumulated by humanity is very
great. It is estimated to be of the order of hundreds of millions. At
the present rates of increase of the quantity of literature, the contents
of libraries will almost double every 10 - 15 years. In 50 - 60 years
an increase of the contents of libraries to 15 - 20 times their present
size may be expected. Specialists in any particular field of knowledge
cannot follow the progress in allied fields of science and technology. For
systematization and selection of literature, a large army of bibliographers
compiles surveys which aid specialists.
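
The growth figures quoted above can be checked with a short calculation. The sketch below assumes steady exponential growth with a fixed doubling period, which the text implies but does not state; the function name is illustrative only.

```python
def growth_factor(years: float, doubling_period: float) -> float:
    """Factor by which holdings grow in `years` if they double
    every `doubling_period` years (assumed steady exponential growth)."""
    return 2.0 ** (years / doubling_period)


# A few combinations of horizon and doubling period taken from the ranges in the text.
for years, period in [(50, 15), (60, 15), (50, 12.5), (60, 10)]:
    factor = growth_factor(years, period)
    print(f"{years} years, doubling every {period} years: x{factor:.1f}")
```

Doubling every 15 years gives a factor of 16 after 60 years, which lies within the 15 - 20 times quoted; a shorter doubling period gives a larger factor.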

Present practice requires the solution of complex technical and
scientific problems in minimal time, taking account of all accumulated
data. Moreover, a substantial part of the time of scientists and specialists
is spent in the selection of literature and in obtaining exhaustive
information on their problems.

Because of the abundance of data scattered in a vast number of journals,
books, and reports, a search for the necessary information sometimes requires
so much time and work that it is easier to repeat either the experiment or
the calculation than to find its description in the literature.

Attempts to classify informational material by means of various 
systems of library classification cannot lead to an efficient solution of 
the question.

Classification, as a method of organized disposition of materials, is
characterized by selection of certain criteria as a basis for division of
information into independent "non-overlapping" groups. Thus, for example,
bacteria are classified according to their morphology, pathological activity,
conditions governing their lives and growth, etc.

Chemical compounds are classified on the basis of their composition,
chemical structure, physical properties, and according to their practical
application. Electronic apparatus is classified according to the field of
application, power, performance, etc.

Classification presupposes, in the main, the variation of certain criteria.
It is impossible to utilize all possible combinations of all the criteria.

The attempt to create such a detailed and finely divided classification
was destined to failure, so the classification had to be accomplished by the
simplest possible questions.

The theory of combinations easily permits the determination of those
"astronomical" figures which are obtained by calculation of the possible
number of elementary questions in such a system. This number, even for
ordinary raw data, will be of the order of 10 to the thousandth power
($10^{1000}$).
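
The text does not say how this figure was obtained. As a rough illustration only, the sketch below assumes that each criterion is either included in a question or left out, so that n criteria yield 2**n - 1 non-empty combinations, and asks how many such criteria are needed to pass 10**1000. Both function names are hypothetical.

```python
import math


def count_questions(n_criteria: int) -> int:
    """Number of non-empty combinations of n include/exclude criteria."""
    return 2 ** n_criteria - 1


def criteria_needed(power_of_ten: int) -> int:
    """Smallest n for which 2**n exceeds 10**power_of_ten."""
    return math.ceil(power_of_ten * math.log2(10))


n = criteria_needed(1000)
print(n, "criteria give more than 10**1000 combinations")
print(len(str(count_questions(n))), "decimal digits in the count")
```

Under this assumption a few thousand independent criteria already suffice, which is why no classification scheme can enumerate every combination in advance.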

Connected with these problems is the review and selection of literature
by every individual scientist and specialist. Help in this work is
provided by bibliographers and by special departments in libraries,
institutes, ministries, and services.

For example, calculation has been attempted of the amount of work
necessary for the satisfaction of daily requirements for scientific
information.

In the Soviet Union there are several million engineers, technicians,
scientific workers, etc. Let each of them require scientific information
only once a year (selection of literature concerning a question of interest).
Let us consider that in one year this constitutes about 3 - 6 million
inquiries, or 10 - 20 thousand inquiries per day.

Let us assume that the satisfaction of each requirement for information
is derived from material the volume of which is limited, on the average, to
one thousand pages of text. Let us note also that one person scans 100 pages
of text in a day. In this case, then, 10 man-days are required for each
selection.

Then, for the completion of all of the work, the labor of 100 - 200
thousand qualified readers is necessary, looking through material in accord
with a large number of requirements.
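
The estimate above can be reproduced in a few lines. The sketch assumes 300 working days per year, a figure not given in the text but consistent with its numbers, and takes 3 and 6 million inquiries per year as the bounds.

```python
WORKING_DAYS_PER_YEAR = 300  # assumption, not stated in the text


def readers_needed(inquiries_per_year: float,
                   pages_per_inquiry: float = 1000,
                   pages_per_reader_day: float = 100) -> float:
    """Number of full-time readers needed to keep up with the inquiries."""
    man_days_per_inquiry = pages_per_inquiry / pages_per_reader_day
    inquiries_per_day = inquiries_per_year / WORKING_DAYS_PER_YEAR
    return inquiries_per_day * man_days_per_inquiry


for yearly in (3_000_000, 6_000_000):
    print(f"{yearly:,} inquiries per year -> {readers_needed(yearly):,.0f} readers")
```

With these inputs each inquiry costs 10 man-days, and the daily load of 10 - 20 thousand inquiries requires 100 - 200 thousand readers.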

We shall work from the desired scale of informational work, and not from
the present situation. But even if these figures are reduced, they
will still indicate the large volume of scientific-information work
necessary for a country.

The continually rising scientific level of Soviet specialists requires
a procedure for scientific-information work which would provide each of them
on the average not one, as is assumed in our calculation, but several pieces
of information per year.

It may be confirmed that because of insufficient information a
significant part of the effort of the staff and facilities of scientific
institutions is wasted on the repetition of investigations which have already
been carried out. Much time is used in the selection of information at the
beginning of any large-scale investigation, because before beginning a new
scientific research it is necessary to become familiar with data from the
total literature. Paraphrasing V. V. Mayakovsky, it can be said that a
scientist, gathering scientific-information material, investigates a
"thousand tons of word-ore" for a single nugget of information.

Information material accumulated in libraries is a vast potential
wealth, which brings all the more benefit the better the
scientific-information work is organized.

J. V. Stalin indicates that the mechanization of work processes is
that determining strength without which it is impossible to maintain
either our tempo or the new scale of production. This point can fully apply
to the arrangement of scientific information.

Mechanization is by no means limited just to the simple acceleration
of the selection of material. Insofar as research may be directed not only
toward solution of Question A, occurring together with function B, but also
toward the solution of B according to A, this relation can help in the
establishment of connections of the type of the bond between cause and effect.

The possibility of a quick abstract of accumulated data in accord with
specific questions leads, moreover, to the weakening, and even to the
elimination, of the rapidly growing divisions of science. At present,
specialists, even of allied disciplines, understand one another with
difficulty. Major difficulties are eliminated by searching for and
utilization of analogies in phenomena, processes and structures found in
different fields. The preparation of material for
informational-bibliographical machines makes use of generalizations of
results of the most diverse investigations and developments.

According to an idea of A. N. Nesmeyanov, it is necessary to be able
mechanically to look through the contents of information in accordance with
given requirements, thus resulting in the selection of required material from
a number of independent references.


Approved For Release 2004/03/31 : CIA-RDP80B01139A00020001 0003-8 



CODIAC-D-12 


Thus, the contents of every work must be expressed in an abstract of
a certain number of the simplest, elementary propositions: theses,
facts, propositions and criteria. Scientific hypotheses, conceptions,
results of experiments, constructions, principles of the functioning of
apparatus, physical constants, time, place of action and other informational
data must be concisely and simply formulated.

In first approximation, we can assume that an abstract of an average
journal article will contain 100 - 200 such propositions.

The analysis of all incoming material and the formulation and statement
of elementary propositions can be produced, according to appropriate rules
and specifications, by the authors of the articles or by special workers.

A large part of this work can be accomplished by the newly organized
Institute of Scientific Information by preparation of abstracts according
to the different divisions of science and engineering.

The selection of accumulated material for a given question must be
done mechanically.

Questions must also be put in the form of the simplest elementary
propositions. A large number of such questions can be made simultaneously.
During the search of the information, the machine must select only those
pieces of information which simultaneously are pertinent to all the questions
posed.
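
The selection rule just described is a conjunctive match: a work is chosen only if its abstract satisfies every question posed. A minimal sketch follows, assuming that each abstract has already been reduced to a set of elementary propositions written as plain strings; the data layout, the function name and the sample works are all hypothetical.

```python
from typing import Dict, FrozenSet, List

# Hypothetical store: work title -> set of elementary propositions.
Abstracts = Dict[str, FrozenSet[str]]


def select(abstracts: Abstracts, questions: List[str]) -> List[str]:
    """Return the works whose abstracts contain every proposition asked for."""
    wanted = frozenset(questions)
    return sorted(title for title, props in abstracts.items() if wanted <= props)


# Invented sample data, for illustration only.
abstracts = {
    "Work A": frozenset({"reaction: H2 + Cl2", "chain mechanism", "gas phase"}),
    "Work B": frozenset({"reaction: H2 + Cl2", "gas phase"}),
    "Work C": frozenset({"chain mechanism", "liquid phase"}),
}

print(select(abstracts, ["reaction: H2 + Cl2", "chain mechanism"]))  # ['Work A']
print(select(abstracts, ["liquid phase", "gas phase"]))              # []
```

An empty result is itself informative: no stored work meets the whole combination of requirements.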

The answer must contain a list of selected works (in the form of a
list of issues of these works in bibliographical order) and give their
contents.

The problem of obtaining photostats or originals of the selected works
must be solved separately. This problem may be solved also by means of
automation (photoautomats and other apparatus).

The fundamental problem of an automatic device (for brevity we shall
call this device a machine) is the selection of a bibliography
according to a combination of a number of requirements (questions). With
a large number of these requirements, the number of possible combinations is
practically infinite.

In certain cases, of course, there can be a combination which is
not reflected in any of the materials. A negative answer to such a question
is also useful, because it testifies to the newness of the problem for
research or development.

Insofar as the selection is controlled by the contents of material,
the machine, according to the design, can answer the most varied questions
in any combination of given requirements.

Let us select and process, for example, information about the physical
constants of molecules. The problem for the machine obviously will be
formulated in the following way: to find, by annual volumes of issues,
publications in which certain constants of molecules of determined
composition have values within certain limits of variation. The number
of criteria may vary from one to an infinite number.

However, the possibilities of a machine built according to these
principles are very broad. In a number of the inquiries, first interest
may not be in a bibliography, but rather in an analysis of the very contents
of the publications.

Let us assume, for example, the development of informational material
concerning chemical kinetics and data on the mechanism of chemical reactions.

The tasks of mechanical selection of information can be formulated as
follows:

1. to find papers in which slow reactions are discussed, i.e., those
reactions where the pre-exponential term has a value within
defined limits.

2. to find papers in which it is indicated that a certain reaction
proceeds with the velocity specified in the question, and also
data concerning temperature and concentration.

Thus, it is possible to set requirements and the numerical values of
certain criteria connected with them, and to require a report of other
associated information (of all or part of it).
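
A request of this kind combines numerical limits on some criteria with a report of other values stored alongside them. The sketch below is one possible reading, not the author's design: the record layout, the field names such as `pre_exponential`, and the sample data are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Record:
    source: str               # bibliographic reference of the paper
    reaction: str             # reaction the data refer to
    values: Dict[str, float]  # named numerical criteria reported in the paper


def query(records: List[Record],
          limits: Dict[str, Tuple[float, float]],
          report: List[str]) -> List[Tuple[str, Dict[str, float]]]:
    """Select records whose criteria all lie within the given limits and
    return, for each, its source and the requested associated values."""
    hits = []
    for rec in records:
        in_limits = all(
            name in rec.values and low <= rec.values[name] <= high
            for name, (low, high) in limits.items()
        )
        if in_limits:
            found = {k: rec.values[k] for k in report if k in rec.values}
            hits.append((rec.source, found))
    return hits


# Invented sample data, for illustration only.
records = [
    Record("Paper 1", "A + B -> C", {"pre_exponential": 1e3, "temperature_K": 300.0}),
    Record("Paper 2", "A + B -> C", {"pre_exponential": 1e13, "temperature_K": 350.0}),
    Record("Paper 3", "D -> E", {"pre_exponential": 5e2}),
]

slow = query(records, {"pre_exponential": (1e2, 1e4)}, report=["temperature_K"])
print(slow)
```

Papers that meet the limits but do not report a requested value are still returned, with that value simply absent from the report.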


At first glance, the indication of an answer according to a
certain list of criteria without an indication of source is meaningless.
But if it is remembered what a huge quantity of material may be examined
mechanically, then the expediency and effectiveness of such a method of
utilization of the machine become evident.

A matter often cannot wait and may require an urgent answer to a
question concerning relationships of certain quantities, correspondence
or discrepancy of a series of criteria, etc. For example, in the
case indicated above, it is interesting to check whether there are
inconsistent data for corresponding reactions under the same conditions or
under different conditions (for example, a reaction in one temperature range
conforms to a usual kinetic equation, but from $t_1$ to $t_2$ the
presence of a more complicated chain reaction is indicated).
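
Such a check for inconsistent data can be pictured as grouping the reported values by reaction and conditions and flagging groups whose values disagree. The sketch below assumes a simple relative tolerance as the test for disagreement; the tolerance, the tuple layout and the sample reports are illustrative choices, not taken from the article.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Report = Tuple[str, str, str, float]  # (source, reaction, conditions, value)


def find_inconsistencies(reports: Iterable[Report],
                         rel_tol: float = 0.1
                         ) -> Dict[Tuple[str, str], List[Tuple[str, float]]]:
    """Group reports by (reaction, conditions) and return the groups whose
    values differ by more than `rel_tol` of the largest magnitude."""
    groups = defaultdict(list)
    for source, reaction, conditions, value in reports:
        groups[(reaction, conditions)].append((source, value))

    flagged = {}
    for key, items in groups.items():
        values = [value for _, value in items]
        spread = max(values) - min(values)
        if spread > rel_tol * max(abs(v) for v in values):
            flagged[key] = items
    return flagged


# Invented sample data, for illustration only.
reports = [
    ("Paper 1", "A + B -> C", "300 K", 1.0),
    ("Paper 2", "A + B -> C", "300 K", 1.05),
    ("Paper 3", "A + B -> C", "400 K", 2.0),
    ("Paper 4", "A + B -> C", "400 K", 5.0),
]
print(find_inconsistencies(reports))
```

Here only the two reports at 400 K are flagged, since they differ by far more than the tolerance.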

The system of mechanization permits one quickly and easily to extract
accumulated knowledge, to compare different results, to analyze data, etc.

However, for the realization of this system it is necessary to solve
many difficult problems.

1. Formulation of an Efficient, Well-Defined System of Recording
of Information.

Science, having generated this problem, has also prepared
the means for its solution. It is sufficient, for example, to
remember the existence of chemical formulae, which characterize
the structure of substances. The theory of dimensions permits
one to express different physical values by means of a small
number of basic quantities. Thus, for the characteristics of
phenomena studied in mechanics, length ($L$), mass ($M$) and time ($T$)
may be taken as bases, and then the dimensions of mechanical
values appear in the following forms: force ($LMT^{-2}$),
velocity ($LT^{-1}$), density ($ML^{-3}$), power ($L^{2}MT^{-3}$), etc.

For the characteristics of electromagnetic phenomena, a fourth
value is added to these three basic values: dielectric permittivity
($\varepsilon$) or magnetic permeability ($\mu$).

Then the words "electrical field intensity" may be written by
a formula of dimension in $L$, $M$, $T$ and $\varepsilon$. If in a text
information is expressed in terms of "electromotive force", "potential",
"tension" or "voltage" (the latter term is eliminated), then these
may all be expressed identically by one and the same formula of dimension.
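
The idea of replacing synonymous terms by one formula of dimension can be sketched with exponent vectors over the base quantities L, M, T and the permittivity. The derivation below assumes Coulomb's law in the form F = q**2 / (eps * r**2) to fix the dimension of charge; that choice of system, and all names in the code, are assumptions made for illustration rather than the notation of the article.

```python
from fractions import Fraction as F

BASE = ("L", "M", "T", "eps")  # length, mass, time, dielectric permittivity


def dim(L=0, M=0, T=0, eps=0):
    """Dimension as a tuple of exponents over BASE."""
    return (F(L), F(M), F(T), F(eps))


def mul(a, b):
    return tuple(x + y for x, y in zip(a, b))


def div(a, b):
    return tuple(x - y for x, y in zip(a, b))


def root(a):
    return tuple(x / 2 for x in a)


def show(d):
    return " ".join(f"{name}^{exp}" for name, exp in zip(BASE, d) if exp != 0)


length = dim(L=1)
force = dim(L=1, M=1, T=-2)

# Assumed Coulomb law: F = q**2 / (eps * r**2), hence q = sqrt(F * eps * r**2).
charge = root(mul(mul(force, dim(eps=1)), mul(length, length)))
field = div(force, charge)      # field intensity: force per unit charge
potential = mul(field, length)  # potential: field intensity times length

# Different words, one dimension formula: the vector serves as a common key.
SYNONYMS = {
    "electromotive force": potential,
    "potential": potential,
    "tension": potential,
    "voltage": potential,
}

print("field intensity:", show(field))
print("potential:      ", show(potential))
```

Under this assumption the four terms share a single exponent vector, so a search keyed on the formula finds all of them regardless of which word an author used.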

The compilation of a unique dictionary which generalizes the
now existing numerous specialized dictionaries would permit one
to find information in the most unexpected places.

There are numerous well-known examples in which "new discoveries"
in one field have long been utilized in another. For example,
the feedback in the mechanical amplifier of a governor
of a steam engine has been used to increase stability for about
80 years, but for electronic tube amplifiers it was "rediscovered"
just in the 1930's, and only 5 years later for magnetic
amplifiers.

A tendency toward maximum generalization exists in every
field of science. Significant achievements in this direction
were made, for example, in the theory of oscillations developed
by Soviet physicists. For the generalization of material, the
experience of the theory of similarity and analogy of physical
phenomena is valuable.

The method of analogies is based on this condition, which
V. I. Lenin brilliantly observed: "The unity of nature appears
in the striking analogy of differential equations pertaining to
different fields of phenomena".

The mathematical analogies of mechanical, acoustic, hydraulic
and other phenomena are widely known. A. N. Krylov indicated
that such "analogies between problems of totally different
fields, but leading to identical differential equations, may be
numerous. Would it have seemed possible to have found anything
in common between a calculation of movement of heavenly bodies
under the effect of attraction to the sun and among themselves
and the rolling of a ship on the heaving sea, or between the
computation of the so-called secular inequalities in the movement
of heavenly bodies and the torsional oscillations of the
drive shaft of a multicylinder diesel engine operating on the
ship propeller or on an electric generator? Meanwhile, if only
formulas and equations without words are written, then it is
impossible to distinguish which of these questions is being
resolved: the equations are one and the same".

Consequently, the presence of analogies in these diverse
physical phenomena permits one to describe them in the following
forms:

a. by a system of general equations
b. by dimensional formulae of the values found
c. by dimensionless values (by criteria of similarity)
d. by a series of elementary propositions, indicating the
purpose and the results of the research or developments.
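
Form (a) in the list above can be illustrated by two systems from different fields that reduce to the same equation. The sketch uses a mass on a spring and an inductance-capacitance circuit, both governed by a*y'' + b*y = 0; these two systems are a standard textbook pairing chosen here for illustration and are not among the examples Krylov names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Oscillator:
    """Any system governed by a*y'' + b*y = 0."""
    a: float
    b: float

    def angular_frequency(self) -> float:
        return math.sqrt(self.b / self.a)


def mass_on_spring(mass: float, stiffness: float) -> Oscillator:
    # m*x'' + k*x = 0
    return Oscillator(a=mass, b=stiffness)


def lc_circuit(inductance: float, capacitance: float) -> Oscillator:
    # L*q'' + q/C = 0
    return Oscillator(a=inductance, b=1.0 / capacitance)


mechanical = mass_on_spring(mass=2.0, stiffness=8.0)
electrical = lc_circuit(inductance=2.0, capacitance=0.125)

print(mechanical == electrical)         # same equation, written without words
print(mechanical.angular_frequency())   # 2.0
```

Once both problems are recorded as the same equation with the same coefficients, nothing in the record distinguishes the mechanical case from the electrical one, which is exactly what makes a search by equation cut across fields.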

In many cases the most precise and concise expression may
be obtained by means of mathematical formulae. This symbolic,
economical form of recording different concepts has undergone
a great change and continues to develop.

For example, the recording of the algebraic equation
$x^3 + ax = b$ 300 years ago would have appeared thus: X cubus + A
planum X aequatur B solido.

At present there is an apparatus for symbolic recording of
the most complicated logical conceptions, operations and
conclusions (for example, the algebra of logic).

The utilization of the arsenal of mathematical means gives
excellent results. The experience from the theory of similarity,
the introduction of dimensionless quantities (criteria of
similarity) for evaluation of the quantities found, in
comparison with single basic units, is also very useful.

We should recall the great experience gained in the clear
formulation of statements in patent matters. It is well known
that it is necessary to write the formula of an invention in the
form of a list of a number of different elementary propositions,
in which each proposition is expressed, so to speak, "without
drawing a breath".

If all these efforts in different fields were added up and
developed, then the result obtained would have great independent
scientific significance.

The development of the technique of scientific information,
according to the ideas of A. N. Nesmeyanov, requires the creation
of a theoretical basis for the generalization of information,
which logically is the next stage of the development of a
theory of similarity and analogy of phenomena. The successful
solution of this problem will lead to a still more efficient
utilization of the position of dialectical materialism concerning the
reflection of the unity of nature in the development of science, and
to the strengthening of the common and interconnected bonds among
the different fields of science.

2. The Development of a Rational System of Classification.

If the recording system answers the question of what to look
for, then the system of classification indicates where to look for
the material.

If the machine method permitted, for example, a time of the
order of 10 - 20 minutes "to look through" all of the accumulated
material, then, in general, the task of the creation of a system
of classification would recede.

Regrettably, the quantity of informational data is so great
that even with the greatest speed of review it will be impossible to
look through all of the material in 10 - 20 minutes.

Preliminary calculations show that if during one year
100,000 abstracts were entered and each of them were, on the average,
formulated of 100 elementary sentences of 10 words each, then
this would amount to 100 million words a year. For the review
of such a quantity of material in 10 minutes, around
1/100,000 of a second per word is required.

For the review of material of 10 years' input, the speed of
scan would have to be increased still more and brought up to a
millionth part of a second per word, or the time of scan would
have to be increased.
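
The scanning-speed estimate follows directly from the figures quoted. The short sketch below reproduces it; the function name and parameter names are illustrative.

```python
def seconds_per_word(abstracts_per_year: int,
                     sentences_per_abstract: int,
                     words_per_sentence: int,
                     years: int,
                     scan_minutes: float) -> float:
    """Time available per word if the whole store is scanned in `scan_minutes`."""
    total_words = (abstracts_per_year * sentences_per_abstract
                   * words_per_sentence * years)
    return scan_minutes * 60 / total_words


one_year = seconds_per_word(100_000, 100, 10, years=1, scan_minutes=10)
ten_years = seconds_per_word(100_000, 100, 10, years=10, scan_minutes=10)

print(f"one year of input:  {one_year * 1e6:.1f} microseconds per word")
print(f"ten years of input: {ten_years * 1e6:.1f} microseconds per word")
```

One year of input allows about 6 microseconds per word, the same order as the 1/100,000 of a second quoted, and ten years of input allows about 0.6 microsecond, which approaches a millionth of a second.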

It is understood that in many of the important cases, to avoid
undesirable omission of material, it perhaps will be necessary to
increase the time and require scanning of all the accumulated
information according to given criteria.

In the majority of cases the scope of scan may be limited.

If information concerning cutting tools of lathes is required,
then it is hardly worth looking for it in literature concerning
electronic oscillographs. Therefore, it will be expedient to
introduce some flexible system of classification for facilitating
and accelerating searches of material.

In this connection there would probably have to be worked
out a new variation of classification for a period of time,
during which it will be expedient to find some general method
of indexing material.

Classification by machine, that is, the system of addresses in a
commutator, must be constructed so that it would be relatively easy
to change with a change of classification.

Inasmuch as the machine method is a fast method of search,
the system must be relatively "consolidated". The structural
plan of classification will probably have the form of "a tree
of knowledge" with a different number of "branches" in the
divisions.

In programming a search of information, it is possible to
list a relatively large number of "addresses" of those "branches"
in which the presence of material is assumed.

The contemporary technique of commutation permits the
establishment for each division of information of not one, but several
parallel "addresses". If, for example, it is known that given
information at the same time is of interest to chemistry, to physics,
to biology, as well as to measurement technique, then it is possible
that this information will be recorded in an "address" under all
the divisions of these fields of science connected by mutual
interests.
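
The "tree of knowledge" with several parallel addresses per item can be sketched as an index keyed by branch paths. The class below is a hypothetical illustration: the path notation with slashes, the class and method names, and the sample entries are assumptions, and the article does not describe how addresses are actually encoded in the commutator.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class AddressIndex:
    """Maps branch addresses such as 'engineering/machining/lathes' to items."""

    def __init__(self) -> None:
        self._by_address: Dict[str, Set[str]] = defaultdict(set)

    def record(self, item: str, addresses: Iterable[str]) -> None:
        """Store one item under several parallel addresses."""
        for address in addresses:
            self._by_address[address].add(item)

    def search(self, branches: Iterable[str]) -> List[str]:
        """Return items stored at, or anywhere below, the listed branches."""
        branches = list(branches)
        found: Set[str] = set()
        for address, items in self._by_address.items():
            if any(address == b or address.startswith(b + "/") for b in branches):
                found |= items
        return sorted(found)


# Invented sample data, for illustration only.
index = AddressIndex()
index.record("lathe cutting tools", ["engineering/machining/lathes"])
index.record("spectral analysis",
             ["chemistry/analysis", "physics/optics", "biology/methods"])

print(index.search(["engineering"]))
print(index.search(["physics", "biology"]))
print(index.search(["electronics/oscillographs"]))
```

Limiting a search to the listed branches narrows the material to be scanned, while the parallel addresses keep an item reachable from every field that has an interest in it.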

The technical side of the question does not concern us here. The
difficulties are great, but contemporary technique is
capable of coping with them.

The development of such a mechanical technique offers serious
scientific and engineering interest, because the results obtained
may also find wide application in automation, telemechanics and
communications. The problem presented is a very promising one.
Because of its significance and its general character it must be
solved in the Academy of Sciences of the USSR.